Processing digital data prior to compression

ABSTRACT

A method includes receiving an original string of bits where each of the bits represents one of two possible logic levels. The string of bits also carries information. A new string is formed, based on the original string, which contains all of the information of the original string by using fewer bits of one of the logic levels.

TECHNICAL FIELD

[0001] This application relates to processing digital data prior to compression.

BACKGROUND

[0002] Compression is useful, for example, to reduce the volume of bits transferred on a communication line from one computer to another, and in that way to reduce the time required for the transfer. The statistical nature of a string of digital data imposes a fundamental limit, known as the entropy rate, on the degree of compression that can be achieved.

DESCRIPTION OF DRAWINGS

[0003] FIG. 1 shows a block diagram of a computer.

[0004] FIG. 2 shows processing a string of bits prior to compression.

[0005] FIG. 3 shows a flow diagram of a pre-compression procedure.

[0006] FIG. 4 shows processing a string of bits after decompression.

[0007] FIG. 5 shows a flow diagram of a post-decompression procedure.

DESCRIPTION

[0008] As shown in FIG. 1(a), in some implementations, the entropy rate for compressing a string of bits 20 can be approached by preprocessing the string, prior to compression, into two bit strings A and B 30, 40 that include fewer logic level 0 bits than does the original string. In reducing the number of logic level 0 bits, the probability that a particular bit has logic level 1 can be made greater than the probability that it has logic level 0. By increasing this probability difference, the subsequent compression of bit string A 30 and bit string B 40 can produce a compressed string that approaches the entropy rate.

[0009] Referring to FIG. 2, the original string of bits 20 may contain any number (N) of bits, for example, as shown in FIG. 2(a). Each bit is represented by a square that is either black, for a logic level 0 bit, or white, for a logic level 1 bit. As shown in FIG. 2(b), bit string A 30 and bit string B 40 are two sub-strings formed from the string of bits 20. Bit string A 30 includes all blocks of consecutively positioned logic 0 bits from the original string of bits 20 and they occupy the same positions in bit string A as in the original bit string. Bit string B 40 contains all non-consecutively positioned bits of logic level 0, also in their original positions. Bit string A 30 includes, in this example, a block of 7 consecutive logic 0 bits 280 from the original string of bits 20 and a block of 4 consecutive logic 0 bits 285 also from the original string of bits 20. All other bits in bit string A are given logic level 1. Bit string B 40 also has a length of N bits and includes, in this example, the two logic 0 bits 287, 289 that were included in the original string of bits and were not positioned within a block of consecutive logic 0 bits. All other bits in string B are given logic level 1.
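The separation described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function name split_into_a_and_b, the use of text strings of '0' and '1' characters, and the 13-bit sample input are all assumptions made for the example.

```python
import re

def split_into_a_and_b(bits: str) -> tuple[str, str]:
    """Split a '0'/'1' string into two full-length strings A and B.

    A keeps the 0 bits that sit in runs of two or more; B keeps the
    isolated 0 bits. Every other position is set to '1'.
    """
    a = ["1"] * len(bits)
    b = ["1"] * len(bits)
    for run in re.finditer(r"0+", bits):
        # Runs of length >= 2 are "blocks" and go to A; single 0s go to B.
        target = a if run.end() - run.start() >= 2 else b
        for i in range(run.start(), run.end()):
            target[i] = "0"
    return "".join(a), "".join(b)

a, b = split_into_a_and_b("1000101100110")
print(a)  # 1000111100111
print(b)  # 1111101111110
```

Both outputs keep the original length N (13 here), matching the statement that string B also has a length of N bits at this stage.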

[0010] The process for generating strings A and B is illustrated in FIG. 3, beginning with a processing procedure (300) that starts (310) prior to compression. An original string of bits is received (320) by a computer for processing into the two bit strings A and B having a reduced number of logic level 0 bits. The original string of bits is separated (330) into bit string A and bit string B.

[0011] Next, as illustrated in FIG. 2(c), those logic 1 bits in bit string B 40 that have the same position as the blocks of consecutive logic 0 bits in bit string A 30 are deleted from string B. By deleting these bits in bit string B, bit string B is shortened to a length of N-M bits, where M is the number of logic 0 bits contained in string A. In the example of FIG. 2(c), bit string B is shortened by M=11 bits.
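The deletion step can be sketched as below, again on an assumed 13-bit example rather than the strings of FIG. 2. The name shorten_b is hypothetical.

```python
def shorten_b(a: str, b: str) -> str:
    """Drop from B every position at which A holds a logic 0 bit."""
    return "".join(bit_b for bit_a, bit_b in zip(a, b) if bit_a == "1")

a = "1000111100111"   # blocks of 0 bits at positions 1-3 and 8-9, so M = 5
b = "1111101111110"   # isolated 0 bits at positions 5 and 12
short_b = shorten_b(a, b)
print(short_b, len(short_b))  # 11011110 8
```

With N = 13 and M = 5, the shortened string has N-M = 8 bits, and both isolated 0 bits survive because they never coincide with a block in A.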

[0012] As shown in FIG. 3, after deleting (340) the bits in bit string B, all of the logic 0 bits in bit string A are inverted (350) to logic level 1, except for the logic 0 bits 282, 284 which define the edges of the blocks of logic 0 bits, which remain at a logic level 0. Thus, as shown in FIG. 2(d), the only logic 0 bits in bit string A 30 are the bits that define the starting 282 and ending 284 bits of the blocks 280, 285 of logic 0 bits.
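A sketch of the inversion step follows, assuming string A is held as a text string in which every run of 0 bits has at least two bits. The name keep_block_edges and the sample input are illustrative only.

```python
import re

def keep_block_edges(a: str) -> str:
    """Invert the interior 0 bits of each block, leaving start and end bits at 0."""
    out = list(a)
    for run in re.finditer(r"0+", a):
        # run.start() is the start bit and run.end() - 1 is the end bit;
        # only the positions strictly between them are inverted.
        for i in range(run.start() + 1, run.end() - 1):
            out[i] = "1"
    return "".join(out)

print(keep_block_edges("1000111100111"))  # 1010111100111
```

A block of exactly two bits has no interior, so it passes through unchanged, as the second block of the sample does.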

[0013] By reducing the number of logic 0 bits in bit string A 30 and bit string B 40, the probability that a logic level 0 occurs at any particular bit is made smaller than the probability of a logic level 1 occurring at that particular bit. By increasing the difference between the probability of a logic level 1 and the probability of a logic level 0, the number of bits required to compress bit string A and bit string B is brought closer to the theoretical compression length, the entropy rate. By approaching the entropy rate, the fewer bits required for compression correspond to shorter transfer periods of the compressed bit strings.
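The shift in bit probabilities can be illustrated with a small calculation. The sketch below assumes a memoryless model in which each bit is treated as an independent draw, which the text does not specify, and uses an assumed 13-bit input together with the strings A and B that the steps above would produce from it.

```python
from math import log2

def binary_entropy(p_zero: float) -> float:
    """Entropy in bits per symbol of a binary source that emits 0 with probability p_zero."""
    if p_zero in (0.0, 1.0):
        return 0.0
    return -p_zero * log2(p_zero) - (1 - p_zero) * log2(1 - p_zero)

def zero_fraction(bits: str) -> float:
    """Fraction of positions in a '0'/'1' string that hold a 0."""
    return bits.count("0") / len(bits)

original = "1000101100110"                 # 7 of 13 bits are 0
processed = "1010111100111" + "11011110"   # A followed by shortened B: 6 of 21 bits are 0

for name, bits in (("original", original), ("processed", processed)):
    p = zero_fraction(bits)
    print(name, len(bits), round(p, 3), round(binary_entropy(p), 3))
```

The processed strings show a smaller fraction of 0 bits and a lower per-bit figure under this model. The processed data is also longer than the original, so the per-bit figure alone does not determine the final compressed size.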

[0014] Returning to FIG. 3, after the logic 0 bits of bit string A have been inverted (350), except for the block start and end bits, the procedure (300) passes (360) bit string A 30 and bit string B 40 to any typical procedure for compressing the two bit strings prior to ending (370). For example, bit string A 30 and bit string B 40 may be concatenated into a single bit string, of length N+(N-M), and compressed, for example, by a Huffman compression technique. Because the bits in the two strings are mostly logic level 1, the compression can get much closer to the entropy rate than would typically be true for compression of the original string.
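The hand-off to a compressor can be sketched as below. Python's standard zlib module (DEFLATE, which combines LZ77 matching with Huffman coding) stands in here for the Huffman technique named above, so this is a substitute and not the technique of the text. The helper names pack_bits and unpack_bits, the padding with 1 bits, and the assumption that the receiver already knows N are all illustrative. Because paragraph [0011] shortens string B to N-M bits, the concatenation in this example holds 13 + 8 = 21 bits.

```python
import zlib

def pack_bits(bits: str) -> bytes:
    """Pack a '0'/'1' string into bytes, padding the final byte with 1 bits."""
    padded = bits + "1" * (-len(bits) % 8)
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))

def unpack_bits(data: bytes, n_bits: int) -> str:
    """Expand bytes back into a '0'/'1' string and discard the padding."""
    return "".join(f"{byte:08b}" for byte in data)[:n_bits]

a_edges = "1010111100111"   # string A with only block start and end bits at 0 (N = 13)
short_b = "11011110"        # string B after deletion (N - M = 8)
combined = a_edges + short_b
print(len(combined))  # 21

compressed = zlib.compress(pack_bits(combined))
restored = unpack_bits(zlib.decompress(compressed), len(combined))
a_back, b_back = restored[:len(a_edges)], restored[len(a_edges):]
print(a_back == a_edges and b_back == short_b)  # True
```

On an input this small the zlib output is longer than the packed input because of its fixed header and checksum; the sketch shows only the data path of concatenating, compressing, decompressing, and de-concatenating.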

[0015] Referring to FIG. 4, the original string of bits 20 may be restored by reversing the process illustrated in FIG. 2. After decompressing and de-concatenating the two sub-strings, bit string A 30 and bit string B 40, as shown in FIG. 4(a), are identical in length and make-up to the bit strings shown in FIG. 2(d). Similar to FIG. 2, black squares still represent logic 0 bits and white squares represent logic 1 bits. As shown in FIG. 4(b), the logic 1 bits between the starting 282 and ending 284 bits are inverted from logic level 1 to logic level 0 and form the blocks of 7 consecutively positioned logic 0 bits 280 and four consecutively positioned logic 0 bits 285.

[0016] The process for restoring the original string of bits 20 is illustrated in FIG. 5, beginning with a processing procedure (500) that starts (510) after the bit string A 30 and bit string B 40 have been decompressed and de-concatenated. Bit string A and bit string B are received (520), for example, by a computer for processing into the original string of bits 20. The logic 1 bits between the starting and ending bits are inverted (530), returning the blocks of consecutively positioned logic 0 bits to bit string A.
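The block restoration can be sketched as follows. The sketch assumes that the 0 bits remaining in string A pair up in order, the first of each pair being a start bit and the second an end bit; this follows if every block has at least two bits and blocks are separated by at least one logic 1 bit, but it is an inference rather than a statement of the text. The name restore_blocks is hypothetical.

```python
def restore_blocks(a_edges: str) -> str:
    """Refill each block by inverting the 1 bits between a start bit and its end bit."""
    out = list(a_edges)
    zeros = [i for i, bit in enumerate(a_edges) if bit == "0"]
    # Pair the 0 positions in order: (start, end), (start, end), ...
    for start, end in zip(zeros[0::2], zeros[1::2]):
        for i in range(start + 1, end):
            out[i] = "0"
    return "".join(out)

print(restore_blocks("1010111100111"))  # 1000111100111
```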

[0017] Next, as illustrated in FIG. 4(c), logic 1 bits are appended to bit string B 40 in positions corresponding to the blocks of logic 0 bits 280, 285 in bit string A 30. In this example, bit string B returns to a length of N bits by appending the 11 logic 1 bits that were deleted in FIG. 2(c).
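Returning string B to its full length can be sketched as below, assuming string A has already had its blocks restored. The name expand_b and the sample values are illustrative.

```python
def expand_b(a: str, short_b: str) -> str:
    """Re-insert a logic 1 bit into B at every position where A holds a 0."""
    remaining = iter(short_b)
    # Where A is 0, B originally held a deleted 1; elsewhere take B's next bit.
    return "".join("1" if bit_a == "0" else next(remaining) for bit_a in a)

print(expand_b("1000111100111", "11011110"))  # 1111101111110
```

In this example the 8-bit string B regains the 5 deleted bits and returns to N = 13 bits.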

[0018] As shown in FIG. 5, after appending (540) the logic 1 bits to bit string B, both bit strings are combined (550) by logically summing each bit pair in the same position of each bit string. Thus, as illustrated in FIG. 4(d), combining bit strings A and B results in an N length string of bits 20 that is a replica of the string of bits 20 shown in FIG. 2(a).
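The combining step can be sketched as below. The sketch assumes that the combination places a logic 0 in the result wherever either string holds a logic 0, which on the literal bit values is a bitwise AND; this reading is an interpretation of the text, chosen because it reproduces the input in the assumed 13-bit example.

```python
def combine(a: str, b: str) -> str:
    """Merge full-length A and B: the result is 0 wherever either input is 0."""
    return "".join("1" if x == "1" and y == "1" else "0" for x, y in zip(a, b))

a = "1000111100111"   # string A with its blocks restored
b = "1111101111110"   # string B returned to N bits
print(combine(a, b))  # 1000101100110
```

The result matches the 13-bit string from which these values of A and B were derived.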

[0019] Returning to FIG. 5, after the two bit strings are combined, the replica string of bits is passed (560), for example, to further process the binary information stored in the replica string of bits 20 prior to the procedure ending (570).

[0020] Returning to FIG. 1(a), the processing of the original string of bits 20 is done in hardware and software that includes an input port 90, included in computer 10, where the original string of bits 20 is received. The received string is stored in a memory 60. The memory 60 also includes software 100 for processing the string of bits 20 into bit string A 30 and bit string B 40. The software 100 may also include instructions to compress the two bit strings 30, 40, into a compressed string of bits 70, which is also stored in the memory 60. After compressing, the compressed string of bits 70 may be transferred from the memory 60 through an output port 110 to other computers or other devices. Computer 10 also includes a processor 50 that executes the software 100 instructions and operating system 120 instructions, also stored in the memory 60.

[0021] Referring to FIG. 1(b), the compressed string of bits 70 may be received through an input port 210, included in another computer 200, for decompressing and further processing. By transferring the compressed string of bits 70 from computer 10 to computer 200, the number of bits transferred is reduced along with the transfer period. The compressed string of bits 70 may be stored in a memory 220, included in computer 200, which also stores software 230 to decompress the compressed string of bits 70 and process the recovered bit strings A 30 and B 40 into a replica of the original string of bits 20. A processor 240 may execute the instructions of software 230 for decompressing and processing of the digital data. After decompressing and processing, the string of bits 20 may be transferred from computer 200 via an output port 250 by executing instructions stored in an operating system 260 also stored in the memory 220. The string of bits 20 may also remain in the memory 220 for further processing on computer 200.

[0022] Although some implementation examples have been discussed above, other implementations are also within the scope of the following claims.

[0023] For example, in the implementation discussed in conjunction with FIG. 1, computers 10 and 200 process the string of bits 20. However, other types of digital devices, such as cellular telephones, personal digital assistants (PDA), pagers, or other similar digital devices may be used to process the string of bits 20. These digital devices may also be used individually, or in combination, to process the string of bits 20.

[0024] Also in conjunction with FIG. 1, various devices may input and output the bit strings A 30 and B 40. Input ports 90 and 210 and output ports 110 and 250 are one example. In other examples, keyboards, diskettes, compact disc read only memories (CD-ROM), or Ethernet connections can input and output the bit strings. Also video displays, printers, or other peripherals may output the bit strings from the computers.

[0025] In conjunction with FIGS. 2-5, processing procedures (300) and (500) operated on blocks of logic 0 bits. However, processing procedures (300) and (500) may also be configured to operate on blocks of logic 1 bits. Other discrete logic representations may also be utilized by the processing procedures (300) and (500).

[0026] In the examples described above, the original strings of bits 20 and bit strings A 30 and B 40 were processed, compressed, transferred, decompressed, and reprocessed by computer 10 and computer 200. However, other types of digital data may be transferred between the computers. For example, digital data files, streams of digital data, or other similar digital data may transfer between the computers.

[0027] The procedure (300), described in conjunction with FIGS. 2 and 3, and procedure (500), described in conjunction with FIGS. 4 and 5, are not limited to any particular hardware or software configuration; they may find applicability in any computing or processing environment. Procedures (300) and (500) may be implemented in hardware, software, or any combination of the two. Procedures (300) and (500) may be implemented in computer programs executing on machines (e.g., programmable computers) that each include a processor, a machine-readable medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and one or more output devices. Procedures (300) and (500) may also be implemented in an application specific integrated circuit (ASIC). Program code may be applied to the string of bits 20, received by the computer 10 and computer 200, in conjunction with FIG. 1, to perform procedure (300), or procedure (500), or to generate output information. The output information may be applied to one or more devices, such as the output ports 110 and 250.

[0028] Each computer program may be implemented in a high-level procedural or object-oriented programming language to communicate with a computer system. However, the computer programs can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language.

[0029] Each computer program may be stored on a machine-readable medium or device, e.g., random access memory (RAM), read only memory (ROM), compact disc read only memory (CD-ROM), hard disk drive, magnetic diskette, or similar medium or device, that is readable by a machine (e.g., a general or special purpose programmable computer) for configuring and operating the machine when the readable medium or device is read by the machine to perform procedure (300) and procedure (500). Procedure (300) and procedure (500) may also be implemented as a machine-readable storage medium, configured with a computer program, where, upon execution, instructions in the computer program cause the machine to operate in accordance with procedure (300) and procedure (500).

[0030] Procedure (300) may operate on one computer while procedure (500) may operate on a separate computer.

What is claimed is:
 1. A method comprising: receiving an original string of bits, each of the bits representing one or the other of two logical levels, the string of bits carrying information; and forming a new string based on the original string, the new string carrying all of the information and including fewer bits of one of the logical levels than were included in the original string.
 2. The method of claim 1 in which the new string is longer than the original string and shorter than twice the length of the original string.
 3. The method of claim 1 in which the new string comprises two sub-strings formed based on the original string.
 4. The method of claim 3 in which the two sub-strings comprise a first bit string and a second bit string, the first bit string including at least two consecutively positioned bits having a first logic level and the second bit string including remaining non-consecutively positioned bits having the first logic level, the at least two consecutively positioned bits defined by a start bit and an end bit.
 5. The method of claim 4 in which the forming includes deleting at least two bits from the second bit string corresponding to the at least two consecutively positioned bits included in the first bit string, and inverting the at least two consecutively positioned bits in the first bit string to a second logic level, the start bit and the end bit are not inverted.
 6. The method of claim 1, further comprising, compressing the new string.
 7. A method comprising: receiving a string of bits, each of the bits representing one or the other of two logical levels, the string of bits carrying fewer bits of one of the logical levels, the string of bits carrying information; and forming a new string, the new string carrying all of the information and including more bits of one of the logical levels than were included in the string of bits.
 8. The method of claim 7 in which the new string is shorter than the received string.
 9. The method of claim 7 in which the received string comprises two sub-strings formed based on an original string.
 10. The method of claim 9 in which the two sub-strings comprise a first bit string and a second bit string, the first bit string including at least two consecutively positioned bits having a first logic level defined by a start bit and an end bit.
 11. The method of claim 10, further comprising, inverting the at least two consecutively positioned bits in the first bit string to a second logic level, the start bit and end bit not being inverted.
 12. A digital device comprising: a processor configured to execute instructions; and a memory storing instructions capable of causing the processor to, receive an original string of bits, each of the bits representing one or the other of two logical levels, the bits of the string carrying information, and form a new string based on the original string, the new string carrying all of the information and including fewer bits of one of the logical levels than were included in the original string.
 13. The digital device of claim 12 in which the new string is longer than the original string and shorter than twice the length of the original string.
 14. The digital device of claim 12 in which the new string comprises two sub-strings formed based on the original string.
 15. A digital device comprising: a processor configured to execute instructions; and a memory storing instructions capable of causing the processor to, receive a string of bits, each of the bits representing one or the other of two logical levels, the string of bits carrying fewer bits of one of the logical levels, the string of bits carrying information, and form a new string, the new string carrying all of the information and including more bits of one of the logical levels than were included in the string of bits.
 16. The digital device of claim 15 in which the new string is shorter than the received string.
 17. The digital device of claim 15 in which the received string of bits comprises two sub-strings.
 18. An article comprising a machine-readable medium that stores instructions capable of causing a digital device to: receive an original string of bits, each of the bits representing one or the other of two logical levels, the string of bits carrying information; and form a new string based on the original string, the new string carrying all of the information and including fewer bits of one of the logical levels than were included in the original string.
 19. The machine-readable medium of claim 18 in which the new string is longer than the original string and shorter than twice the length of the original string.
 20. The machine-readable medium of claim 18 in which the new string comprises two sub-strings formed based on the original string.
 21. An article comprising a machine-readable medium that stores instructions capable of causing a digital device to: receive a string of bits, each of the bits representing one or the other of two logical levels, the string of bits carrying fewer bits of one of the logical levels, the string of bits carrying information; and form a new string, the new string carrying all of the information and including more bits of one of the logical levels than were included in the string of bits.
 22. The machine-readable medium of claim 21 in which the new string is shorter than the received string.
 23. The machine-readable medium of claim 21 in which the received string comprises two sub-strings.
 24. A method comprising: receiving an original string of bits, each of the bits representing one or the other of two logical levels, the string of bits carrying information; and forming a first bit string and a second bit string based on the original string, the first bit string including at least two consecutively positioned bits having a first logic level and the second bit string including remaining non-consecutively positioned bits having the first logic level, the at least two consecutively positioned bits defined by a start bit and an end bit.
 25. The method of claim 24, further comprising: deleting at least two bits from the second bit string corresponding to the at least two consecutively positioned bits included in the first bit string; inverting the at least two consecutively positioned bits in the first bit string to a second logic level, the start bit and the end bit not being inverted; and compressing the new string.
 26. The method of claim 25, further comprising: decompressing the new string into the first and second bit string; and inverting at least two consecutively positioned bits in the decompressed first bit string to the first logic level, the start bit and the end bit not being inverted.
 27. The method of claim 26, further comprising: appending at least two bits to the decompressed second bit string corresponding to the at least two consecutively positioned bits having the first logic level included in the decompressed first bit string; and combining the decompressed first bit string and the decompressed second bit string.
 28. A system comprising: a source of a string of bits; and a digital device configured to, receive the string of bits from the source, each of the bits representing one or the other of two logical levels, the string of bits carrying information, and form a new string based on the original string, the new string carrying all of the information and including fewer bits of one of the logical levels than were included in the original string.
 29. The system of claim 28 in which the new string is longer than the original string and shorter than twice the length of the original string.
 30. The system of claim 28 in which the new string comprises two sub-strings based on the original string.